A Boundedness Theoretical Analysis for GrADP Design: A Case Study on Maze Navigation

Authors

  • Haibo He
  • Zhen Ni
  • Xiangnan Zhong
Abstract

A new theoretical analysis of the goal representation adaptive dynamic programming (GrADP) design proposed in [1], [2] is investigated in this paper. Unlike the convergence proofs for adaptive dynamic programming (ADP) in the literature, here we provide new insight into the error bound between the estimated value function and the expected value function. We then employ the critic network in the GrADP approach to approximate the Q value function, and use the action network to provide the control policy. The goal network is adopted to provide the internal reinforcement signal for the critic network over time. Finally, we illustrate on the maze navigation example that the estimated Q value function stays within an arbitrarily small bound of the expected value function.

Conference Name: Proc. Int. Joint Conf. Neural Networks (IJCNN'15)
Conference Date: July 13, 2015
Affiliation: Department of Electrical, Computer, and Biomedical Engineering, University of Rhode Island, Kingston, RI, USA 02881
Email: {ni,xzhong,he}@ele.uri.edu
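The division of labor among the three components named in the abstract can be illustrated with a minimal tabular sketch. It is not the authors' implementation: the paper uses neural networks, whereas the lookup tables, the 1-D corridor "maze", the reward of -1 per step, the discount factor, and the learning rate below are all assumptions chosen only to keep the example small. The goal table learns an internal signal s(x, u) from the external reward, the critic table learns Q(x, u) from s rather than from the reward directly, and a greedy rule over Q plays the role of the action network.

```python
# Tabular stand-in for the three GrADP components on a 1-D corridor "maze".
# Environment, tables, and hyperparameters are illustrative assumptions;
# the paper itself uses neural networks for all three components.

N_STATES = 5          # cells 0..4, cell 4 is the goal
GOAL = N_STATES - 1
ACTIONS = (-1, +1)    # index 0 = move left, index 1 = move right
ALPHA = 0.9           # discount factor (assumed)
LR = 0.5              # learning rate for both tables (assumed)


def step(state, action_index):
    """Deterministic transition: returns (next_state, external_reward, done)."""
    next_state = min(max(state + ACTIONS[action_index], 0), GOAL)
    done = next_state == GOAL
    reward = 0.0 if done else -1.0   # cost of -1 per step until the goal
    return next_state, reward, done


def train(sweeps=500):
    # Goal table: internal reinforcement signal s(x, u).
    s = [[0.0, 0.0] for _ in range(N_STATES)]
    # Critic table: Q(x, u), trained on s instead of the external reward.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for x in range(GOAL):                 # the goal cell is terminal
            for u in range(len(ACTIONS)):
                nx, r, done = step(x, u)
                # Goal component: bootstraps from the external reward r.
                s_target = r + (0.0 if done else ALPHA * max(s[nx]))
                s[x][u] += LR * (s_target - s[x][u])
                # Critic component: uses s as its reinforcement signal.
                q_target = s[x][u] + (0.0 if done else ALPHA * max(q[nx]))
                q[x][u] += LR * (q_target - q[x][u])
    return s, q


def greedy_policy(q):
    """Action component stand-in: pick the action with the larger Q value."""
    return [max(range(len(ACTIONS)), key=lambda u: q[x][u]) for x in range(GOAL)]


s_table, q_table = train()
policy = greedy_policy(q_table)
```

In this sketch the per-step cost is negative on purpose: because the critic accumulates the internal signal along a path, a positive signal would make longer paths look better, while a cost makes the shortest path to the goal the greedy choice.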


Similar resources

Melatonin improves spatial navigation memory in male diabetic rats

The aim of the present study was to evaluate the effect of melatonin as an antioxidant on spatial navigation memory in male diabetic rats. Thirty-two male white Wistar rats weighing 200 ± 20 g were divided into four groups, randomly: control, melatonin, diabetic and melatonin-treated diabetic. Experimental diabetes was induced by intraperitoneal injection of 50 mg kg-1 streptozotocin. Melatonin...


Vitamin D Deficiency Impairs Spatial Learning in Adult Rats

Background: Through its membrane and intracellular receptors, vitamin D regulates many vital functions in the body, including its well-known actions on the musculoskeletal system. A growing body of evidence demonstrates that vitamin D underlies some behavioral aspects of neurocognition. The present study was designed to evaluate the effect of food regimens without vitamin D or with a supplement of ...


Practical Evaluation of EKF1 and UKF2 Filters for Terrain Aided Navigation

This article studies batch and recursive methods used in terrain navigation systems. Terrain navigation has many disadvantages, so researchers have studied different methods of aided navigation for many years. As a result, more types of aided navigation systems have been introduced, each with practical and theoretical advantages and disadvantages. One of the main ideas for ...


Information Architecture of Research Institutes’ Website, Case Study: Iranian Research Institute for Information Science and Technology’s Website

Purpose: As mission-oriented organizations, research institutes have the task of answering community questions in specialized areas, and should therefore be able to effectively present their outputs to their target users. Achieving such a goal requires the proper use of information architecture principles to properly organize the information platform in which the research institutes interact wi...


Navigation in Determining the Physical Factors Affecting Creativity of Children's in Urban Parks

Despite the availability of extensive facilities for children, the effect of environment on the creativity of children is often ignored. Children can attend the playgrounds in city parks independently from age 6, so they become exposed to the influence of the environment during this age period. It is necessary to design playgrounds for children to improve their creativity. The o...



Publication date: 2015